Abstract

Problem: Practitioners working with multiple-choice tests have long 
utilized Item Response Theory (IRT) models to evaluate the performance 
of test items for quality assurance. The use of similar applications for 
performance tests, however, is often encumbered due to the challenges 
encountered in working with complicated data sets in which local 
calibrations alone provide a poor model fit. 

Purpose: The purpose of this study was to investigate whether the item 
calibration process for a performance test, computer-based case 
simulations (CCS), taken from the United States Medical Licensing 
Examination® (USMLE®) Step 3® examination may be improved through 
explanatory IRT models. It was hypothesized that explanatory IRT may
help improve data modeling for performance assessment tests by allowing
important person predictors to be added to conventional IRT models, which
are limited to item predictors alone.

Methods: The responses of 767 examinees from a six-item CCS test were 
modeled using the Partial Credit Model (PCM) and four explanatory 
model extensions, each incorporating one predictor variable of interest. 
Predictor variables were the examinees' gender, the order in which 
examinees encountered an individual item (item sequence), the time it 
took each examinee to respond to each item (response time), and 
examinees' ability score on the multiple-choice part of the examination.

Results: Results demonstrate a superior model fit for the explanatory PCM
with examinee ability score from the multiple-choice portion of Step 3. 
Explanatory IRT model extensions might prove useful in complex 
performance assessment test settings where item calibrations are often 
problematic due to short tests and small samples. 

Recommendations: The findings of this study have practical value and
implications for researchers working with small or complicated response
data. Explanatory IRT methodology not only provides a way to improve
data modeling for performance assessment tests but also enhances the
inferences made by allowing important person predictors to be
incorporated into a conventional IRT model.

Keywords: Explanatory Item Response Theory, Partial Credit Model, Item
Response Theory, Performance Tests, Item Calibration, Ability Estimation,
Small Tests

Introduction 

Over the past few decades, Item Response Theory (IRT) applications have
become a vital part of the scoring processes in many large-scale test settings. IRT 
encompasses a family of nonlinear models that provide an estimate of the probability 
of a correct response on a test item as a function of the characteristics of the item (e.g., 
difficulty, discrimination) and the ability level of test takers on the trait being 
measured (e.g., Hambleton, Swaminathan & Rogers, 1991; McDonald, 1999; Skrondal 
& Rabe-Hesketh, 2004). IRT models are particularly appealing in that if the IRT 
model fits the data set, the resulting item and ability parameters can be assumed to 
be sample independent (item and ability parameter invariance property). 
Practitioners working with multiple-choice tests have long utilized IRT models to link
observable examinee performance on test items to an overall unobservable ability, as
well as to evaluate the performance of test items and test forms for quality assurance
(see Hambleton & Van der Linden, 1982, for an overview).

Applications of IRT models to performance tests, however, have long been 
encumbered by the challenges encountered in modeling novel performance test data. 
Historically, one issue was that the IRT models were developed for dichotomous 
items (Spearman, 1904; Novick, 1966). This made them unsuited for performance 
tests that often had items with ordinal categorical scales (to allow scoring partially 
correct answers). However, extensions for polytomous items (Bock, 1972; Fitzpatrick, 
Link, Yen, Burket, Ito, & Sykes, 1996; Samejima, 1969) soon emerged and solved this 
particular issue. Another issue that remains to date is goodness of model fit. 
Although performance tests with novel item formats are believed to be more suited 
for measuring higher-level examinee abilities (Kane & Mitchell, 1996; Nitko, 1996), 
they are also typically very difficult to model (e.g., Masters, 1982; Yen, 1983). One
reason is that performance tests are almost always drastically shorter than their
multiple-choice counterparts. Because it is often very expensive to develop and
administer performance tests as lengthy as multiple-choice tests, many performance
tests cannot satisfy the demand of IRT models for large numbers of items. Another
reason is the contextual effects introduced by the novelty of the test. The influence
of various person and test design variables is often amplified for performance tests,
undermining the goodness of fit of the estimated IRT models. To this end, the current
study investigates whether an alternative IRT modeling approach with added covariates
from the generalized linear and non-linear mixed modeling framework (Embretson, 1998;
De Boeck & Wilson, 2004; Wang, Wilson & Shih, 2006) can be used to help improve model
estimation for a novel performance test, namely, computer-based case
simulations (CCS).

Purpose

The current study was undertaken to investigate whether the item calibration 
process for the CCS examination could be improved using an explanatory IRT 
model. The CCS is a part of the United States Medical Licensing Examination® 
(USMLE®) Step 3 and was introduced in 1999, when the examination transitioned 
from paper-and-pencil administration to computer-based administration. This 
examination uses a small series of computer-based case simulations (CCS items) to 
expose examinees to interactive patient-care simulations; for each simulation they 
must initiate and manage patient care, while receiving patient status feedback and 
managing the simulated time in which the case unfolds (Margolis, Clauser, & Harik, 
2004; Clauser, Harik, & Clyman, 2000). 

The explanatory IRT model application presented in this paper explores the 
usefulness of four different predictor variables in improving the item calibration 
process of the CCS examination: examinees' gender, the order in which each 
individual CCS item was presented during the examination (item sequence), the time 
it took each examinee to respond to each item (response time), and examinees' ability 
score on the multiple-choice part of Step 3. Although only the latter covariate was
hypothesized to be an important predictor of examinee performance on the CCS, as it
is the only construct-relevant covariate, the importance of the other variables was
also tested. Item sequence and response time were explored because the literature
suggests their usefulness as
predictors of examinee performance (e.g., Ramineni, Harik, Margolis, Clauser,
Swanson & Dillon, 2007; Lu & Sireci, 2007; Leary & Dorans, 1985; Yen, 1980). The
importance of the gender variable was tested mainly to investigate whether CCS 
items were easier for one of the gender subgroups. 

A series of alternative explanatory IRT models were estimated using the Partial 
Credit Model (PCM) and with one predictor variable at a time. This resulted in the 
following models: (1) a base model (no covariates), (2) an explanatory model with 
gender effect, (3) an explanatory model with item response time effects, (4) an 
explanatory model with item sequence effects, and (5) an explanatory model with 
examinees' scores on the multiple-choice part of the Step 3 examination (MCT score). 
Table 1 gives an overview of the estimated models. The PCM with no
covariates was used as a base model to evaluate the hypothesized improvement in
model fit for each explanatory model with one added covariate. This "one covariate
at a time" approach was used to ensure that any observed improvement in
model fit is due to the added covariate alone.

Table 1.

Estimated Models and Covariates 



Method 

Data 

Study data included the responses of 767 examinees to a six-item CCS test. The
items were administered in random order under standard testing conditions, with
a maximum of 25 minutes of testing time per item. For this analysis, examinee
responses were coded using a 3-point category scale from 0 to 2, with 2 representing
maximum credit for a given CCS item.

Model Estimation 

For dichotomous items, under the Rasch model (Rasch, 1960; Wright, 1997), the
probability of a positive response (or a correct answer) to item i for person j with
latent trait $\theta_j$ is

$$P(Y_{ij} = 1 \mid \theta_j, \beta_i) = \frac{\exp(\theta_j - \beta_i)}{1 + \exp(\theta_j - \beta_i)} \qquad (1)$$

where $\beta_i$ is the difficulty of item i. The probability of a person's answering an item
correctly is, therefore, a function of the difference between the person's ability and
the difficulty of the item. The person parameters are assumed to be independently
and normally distributed with a mean of zero and a variance of $\sigma^2$. In other words,
the person parameter is a random effect while the item parameter is a fixed effect.

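The partial credit model (PCM; Masters, 1982) extends the Rasch model for
binary responses to pairs of adjacent categories in a sequence of ordered responses.
For an item on an m-point scale, there are m-1 step parameters to estimate. The step
parameter $\beta_{ik}$ (k = 1, ..., m-1) is the value of $\theta_j$ at which the probabilities of
responding in categories k-1 and k are equal. For an item with a 3-point scale, the
probabilities of responding to each of the categories are given by

$$P(Y_{ij} = 0 \mid \theta_j) = \frac{1}{D_{ij}}, \qquad P(Y_{ij} = 1 \mid \theta_j) = \frac{\exp(\theta_j - \beta_{i1})}{D_{ij}}, \qquad P(Y_{ij} = 2 \mid \theta_j) = \frac{\exp(2\theta_j - \beta_{i1} - \beta_{i2})}{D_{ij}},$$

where $D_{ij} = 1 + \exp(\theta_j - \beta_{i1}) + \exp(2\theta_j - \beta_{i1} - \beta_{i2})$.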

Figure 1 plots category response functions for an illustrative CCS item with a
3-point scale with $\beta_{i1} = -1$ and $\beta_{i2} = 1$. In the figure, it can be seen that the category
response functions for categories 0 and 1 intersect at $\beta_{i1}$, while the category response
functions for categories 1 and 2 intersect at $\beta_{i2}$.

Figure 1. Category response functions for an illustrative CCS item with a
three-point scale (horizontal axis: ability)
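
The crossing pattern described for Figure 1 can be checked numerically. The following is a minimal Python sketch (Python is used for illustration only; the study itself used SAS), assuming the three-category PCM parameterization that appears in the Appendix A syntax. The function name `pcm_category_probs` is hypothetical and not part of the original analysis.

```python
import math

def pcm_category_probs(theta, step1, step2):
    """Return (P0, P1, P2) for a three-category partial credit item.

    Mirrors the Appendix A syntax: exp1 = exp(theta - step1),
    exp2 = exp(2*theta - step1 - step2), denom = 1 + exp1 + exp2.
    """
    exp1 = math.exp(theta - step1)
    exp2 = math.exp(2.0 * theta - step1 - step2)
    denom = 1.0 + exp1 + exp2
    return (1.0 / denom, exp1 / denom, exp2 / denom)

# Illustrative item from Figure 1: step parameters -1 and 1.
for theta in (-1.0, 0.0, 1.0):
    p0, p1, p2 = pcm_category_probs(theta, -1.0, 1.0)
    print(f"theta={theta:+.1f}  P0={p0:.3f}  P1={p1:.3f}  P2={p2:.3f}")
```

At an ability of -1 the probabilities of categories 0 and 1 coincide, and at an ability of 1 the probabilities of categories 1 and 2 coincide, which is the intersection pattern described for the illustrative item.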


The Partial Credit Model with random effects 

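The linear random effects PCM with a person covariate $Z_j$ is obtained by decomposing the person parameter as

$$\theta_j = \gamma Z_j + \varepsilon_j, \qquad \varepsilon_j \sim N(0, \sigma^2),$$

where $\gamma$ is the fixed effect of the covariate and $\varepsilon_j$ is the remaining person-specific random effect; the category probabilities of the PCM are otherwise unchanged.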

The PCM and the explanatory PCMs were estimated using the PROC NLMIXED
routine of the Statistical Analysis System (SAS version 9.1.3). For the analysis we
used a quasi-Newton optimization technique and non-adaptive Gauss-Hermite
quadrature with 10 quadrature points for each dimension (the SAS code
used in the calculations is given in Appendix A). Goodness of
model fit was evaluated using the -2 log likelihood, the Akaike information criterion
(AIC; Akaike, 1974), and the Bayesian information criterion (BIC; Schwarz, 1978), with
lower values indicating better fit.
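
To make the estimation step more concrete, the sketch below shows how a non-adaptive Gauss-Hermite rule can approximate one examinee's marginal likelihood under an explanatory PCM. It is a minimal Python illustration, not the SAS routine used in the study. It assumes the decomposition theta = eps + effect * covariate with eps normally distributed, as in the Appendix A syntax. All function names, the node-finding method (grid search plus bisection), and the example values are hypothetical.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) from the three-term recurrence."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, 2.0 * x
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def gauss_hermite(n):
    """Nodes and weights such that sum(w * f(x)) approximates the integral
    of f(x) * exp(-x**2) over the real line."""
    bound = math.sqrt(2.0 * n + 1.0)   # all roots of H_n lie in (-bound, bound)
    cells = 2001                       # odd, so x = 0 is never a grid point
    grid = [-bound + 2.0 * bound * i / cells for i in range(cells + 1)]
    nodes = []
    for lo, hi in zip(grid, grid[1:]):
        if hermite(n, lo) * hermite(n, hi) < 0.0:   # sign change brackets a root
            for _ in range(60):                     # refine by bisection
                mid = 0.5 * (lo + hi)
                if hermite(n, lo) * hermite(n, mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            nodes.append(0.5 * (lo + hi))
    scale = 2.0 ** (n - 1) * math.factorial(n) * math.sqrt(math.pi) / n ** 2
    weights = [scale / hermite(n - 1, x) ** 2 for x in nodes]
    return nodes, weights

def pcm_category_probs(theta, step1, step2):
    """(P0, P1, P2) for a three-category partial credit item."""
    exp1 = math.exp(theta - step1)
    exp2 = math.exp(2.0 * theta - step1 - step2)
    denom = 1.0 + exp1 + exp2
    return (1.0 / denom, exp1 / denom, exp2 / denom)

def marginal_likelihood(responses, steps, covariate, effect, sd, n_points=10):
    """Marginal likelihood of one examinee's responses, integrating
    eps ~ N(0, sd**2) out of theta = eps + effect * covariate."""
    nodes, weights = gauss_hermite(n_points)
    total = 0.0
    for x, w in zip(nodes, weights):
        # Change of variable eps = sqrt(2) * sd * x maps the normal density
        # onto the exp(-x**2) weight function of the quadrature rule.
        theta = math.sqrt(2.0) * sd * x + effect * covariate
        like = 1.0
        for y, (step1, step2) in zip(responses, steps):
            like *= pcm_category_probs(theta, step1, step2)[y]
        total += w * like
    return total / math.sqrt(math.pi)

# Hypothetical examinee: two items with illustrative steps (-1, 1), scored 2 and 1.
items = [(-1.0, 1.0), (-1.0, 1.0)]
print(marginal_likelihood([2, 1], items, covariate=0.8, effect=0.5, sd=0.4))
```

Summing the logarithm of this quantity over examinees gives the log likelihood that an optimizer would maximize over the step parameters, the covariate effect, and the standard deviation. A production routine would compute the nodes once rather than on every call.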


Results 

The results summarized in Table 2 show that adding examinees' MCT score
to the model as a person covariate improved the fit of the PCM,
producing the lowest -2 log likelihood, AIC, and BIC. The remaining explanatory
models, with gender, item sequence, or item response time as a covariate, showed
little or no improvement over the base PCM. Table 3 lists the category threshold and
variance parameter estimates produced by these models. Consistent with the
observed improvement in the corresponding model fit statistics, MCT score was the
only significant predictor (estimated effect 0.50) among the four considered. The item
sequence, response time, and gender effects were all approximately zero (0.01, smaller
than 0.001, and 0.04, respectively).


Table 2.

Model Fit Comparisons

Model                      Number of parameters   -2 log likelihood   AIC    BIC
PCM                        13                     8909                8935   8996
PCM with gender            14                     8908                8936   9002
PCM with response times    14                     8896                8924   8989
PCM with item sequence     14                     8906                8934   8999
PCM with MCT score         14                     8862                8889   8955

* Models with multiple predictors were not feasible for this data set since only the 
MCT score was useful as a predictor among the four considered. 
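
The AIC and BIC columns of Table 2 can be approximately recovered from the -2 log likelihood and the parameter count. The sketch below is a minimal Python illustration, assuming the textbook definitions AIC = -2LL + 2p and BIC = -2LL + p ln(N), and assuming that N is the number of examinees (767). The function name is hypothetical. Because the tabled -2 log likelihoods are rounded to integers, recomputed values may differ from the table by about one unit.

```python
import math

def information_criteria(neg2_loglik, n_params, n_subjects):
    """Return (AIC, BIC) from a -2 log likelihood; lower values indicate better fit."""
    aic = neg2_loglik + 2 * n_params
    bic = neg2_loglik + n_params * math.log(n_subjects)
    return aic, bic

# Rounded -2 log likelihoods and parameter counts as reported in Table 2.
table2 = {
    "PCM": (8909, 13),
    "PCM with gender": (8908, 14),
    "PCM with response times": (8896, 14),
    "PCM with item sequence": (8906, 14),
    "PCM with MCT score": (8862, 14),
}

for name, (neg2ll, n_params) in table2.items():
    aic, bic = information_criteria(neg2ll, n_params, n_subjects=767)
    print(f"{name:26s} AIC={aic:8.1f}  BIC={bic:8.1f}")
```

Each explanatory model adds one parameter to the base PCM, so under these definitions its AIC falls below that of the base model only when the -2 log likelihood drops by more than 2.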



Eurasian Journal of Educational Research 




Table 3.

Parameter Estimates

                                          Models
Parameter                PCM     PCM with   PCM with         PCM with        PCM with
                                 gender     response times   item sequence   MCT score
b1cat1                   -0.74   -0.72      -1.06            -0.66           -1.06
b2cat1                   -0.74   -0.72      -1.03            -0.66           -0.17
b3cat1                   -1.35   -1.33      -1.62            -1.28           -0.78
b4cat1                   -0.33   -0.31      -0.59            -0.25            0.24
b5cat1                   -0.47   -0.45      -0.78            -0.40            0.10
b6cat1                   -0.97   -0.95      -1.25            -0.90           -0.40
b1cat2                   -0.86   -0.84      -1.16            -0.79           -0.29
b2cat2                   -1.14   -1.12      -1.41            -1.07           -0.57
b3cat2                    0.31    0.32       0.05             0.37            0.87
b4cat2                   -0.61   -0.59      -0.85            -0.54           -0.04
b5cat2                   -0.71   -0.69      -1.00            -0.64           -0.14
b6cat2                    0.51    0.52       0.25             0.58            1.07
Effect of the
predictor variable                0.04       0.00             0.01            0.50
$\sigma^2$                0.21    0.21       0.23             0.21            0.17

* Standard Error of the estimates ranged between 0.08 and 0.16. 


Figure 2 and Figure 3 plot category response functions for the six CCS items
using threshold parameters estimated by the base PCM and the best-fitting
explanatory PCM with the MCT score predictor, respectively. Comparing the
category probabilities displayed in Figure 2 with those in
Figure 3 reveals that augmenting the base PCM with the MCT score greatly improves
the functional form of the CCS items.


Discussion 

Explanatory IRT models incorporating item or person covariates are increasingly
used in many test settings to learn more about predictors of examinee performance
(e.g., Fischer, 1983; De Boeck & Wilson, 2004; Embretson, 1984; Embretson, 1997) and
to help improve item calibration and scoring procedures (e.g., Fox, 2005; Hartig,
Frey, Nold & Klieme, 2012; Zwinderman, 1991). The premise of the current paper is
that they may also be useful in the context of authentic performance assessments
with short tests. This paper demonstrates that explanatory PCMs with meaningful
predictors might prove useful in calibrating complex performance tests similar to the
USMLE CCS, which can otherwise be difficult to calibrate.

For the CCS application presented in this paper, the meaningfulness of four 
individual predictor variables was tested: examinees' gender, the order in which 
each individual CCS was presented during the examination (item sequence), the time 
it took each examinee to respond to each case (response time) and examinees' ability 
score on the multiple-choice part of Step 3. While only the latter predictor variable 
was found to be of statistical and practical significance, the results nicely illustrate 
how an explanatory approach can be used to investigate the usefulness of individual 
predictor variables in model estimation. Although it was not feasible for the present 
application, as only one of the covariates was found to be of statistical importance, it 
is recommended that researchers explore multivariate model extensions to further 
assess if a more complex model with multiple predictors may further improve model 
fit. 

The findings of this study have great value for researchers and practitioners 
working with small performance tests and complex response data in which local 
calibrations alone provide a poor model fit. Explanatory model extensions of PCM 
not only provide a way to improve data modeling for short performance assessment 
tests but also open other possibilities by allowing various person predictors to be 
added to conventional item response models, which are limited to item predictors 
alone. Future research should investigate the influence of other item and person 
predictors on CCS performance to determine if any can lead to a stronger model fit, 
more stable parameter estimates, or a more precise measure of CCS proficiency. One 
predictor of future interest, for example, could be the examinees' postgraduate 
medical training (Dillon, Henzel & Walsh, 1997; Feinberg, 2012). Examinees who are 
exposed to a broad range of training during their residency or clinical experience 
might perform better on the MC items of Step 3 as compared to examinees who have 
a narrow training focus (Sawhill, Dillon, Ripkey, Hawkins, & Swanson, 2003). 

Figure 3. Item characteristic curves for the six CCS items estimated by the
explanatory PCM with MCT scores

Application of Explanatory Item Response Theory to an Interactive Computer Simulation Test

Citation:

Kahraman, N. (2014). An explanatory item response theory approach for a computer-
based case simulation test. Eurasian Journal of Educational Research, 54, 117-134.


Summary

Problem: Item response models, which are frequently used in test development and in
examining the reliability and validity of the tests developed, have long been used
with multiple-choice tests to monitor item and test quality. Their use for the same
purpose with performance tests, however, has met with many difficulties. The first
was that the earliest item response models were suitable only for dichotomously
scored test items, whereas performance test items are mostly developed to require
partial-credit scoring. This problem was resolved fairly quickly with the development
of item response models suited to partial-credit scoring. Another difficulty, which
persists today, is that performance test data are less amenable to modeling with item
response models. In other words, when applied to performance tests, item response
models can yield item and person statistics of limited reliability. Two important
reasons are that performance tests are shorter than multiple-choice tests and that
performance items are more open than multiple-choice items to the influence of many
factors not directly related to the skills being measured. Psychometricians working
with performance tests, like their colleagues working with other tests, need the
largely sample-independent item and person statistics that item response models
provide, and they await new models that can overcome the difficulties listed above.

Purpose: Explanatory item response models, which allow secondary variables to be
included in model estimation as predictors, are used to improve the quality of item
and person statistics for many tests administered in many different settings.
However, no study has yet examined their use in addressing the poor model fit and
low reliability that are frequently encountered when item response models are applied
to performance tests. The purpose of this study was to evaluate whether explanatory
item response models are a good alternative for a performance test consisting of six
interactive performance items, for which fit indices are poor under conventional
item response models.

Method: The sample consists of the responses of 767 examinees to the six performance
items of the CCS (Computer Case Simulations) test studied here. The CCS is a
performance test given in the third and final step of a three-step examination that
physician candidates take to qualify for a license to practice in the United States.
In this final step, candidates take this performance test in addition to a
multiple-choice test. During the examination, candidates are given a patient profile
on the computer for each CCS case and can carry out the diagnosis and follow-up they
consider appropriate in an interactive environment. Candidates may spend a maximum of
25 minutes on each CCS case. In this study, examinees' performance on each item was
scored 0 for incorrect management, 1 for partially correct management, and 2 for
correct management.

Because partial-credit scoring was used, five separate models were estimated with
the Partial Credit Model. The first model was estimated without any predictor
variable, that is, as a conventional partial credit model. The second model used the
item sequence as a predictor, the third the time spent on the item, the fourth the
candidate's gender, and the fifth the candidate's score on the multiple-choice part
of the final-step examination. To test the usefulness of each predictor, the fit
indices computed for each explanatory item response model were compared with those
computed for the conventional item response model.

Findings: The model fit indices show that the score on the multiple-choice section
is a good predictor. The order in which the item was answered, the total time spent
on the item, and the candidate's gender were not found useful as predictors. A
comparison of the figures produced from the item threshold values estimated by the
conventional item response model and by the explanatory item response model with the
multiple-choice score clearly shows that an explanatory item response model built
with a good predictor can change the functional relationship between item statistics
and examinees' ability levels for the better.

Recommendations: Experts consider performance examinations, in which examinees can
demonstrate their knowledge and skills, superior to multiple-choice examinations in
many respects. However, the reliability of scores from performance examinations is
generally lower than that of multiple-choice examinations. Increasing the number of
items, the most common way to raise test reliability, is not easy for performance
examinations: developing, administering, and scoring performance items can be quite
labor-intensive and expensive. If the number of items cannot be increased, one
alternative is to use available additional data in model estimation. This study took
such an approach.

The findings show that performance tests for which acceptable fit indices and
reliable item statistics are hard to obtain under a conventional item response model
can benefit from explanatory item response models. The results for the CCS test
studied here show that the additional information provided by secondary variables
improves the estimates that would be obtained without it. Of course, merely having
secondary data available and adding them to the model is not in itself sufficient
for an explanatory item response model to succeed. The contribution of each
secondary variable should be evaluated separately using a stepwise approach such as
the one used in this study. Explanatory item response models also give users the
opportunity to develop different models. For example, when the data permit,
researchers can readily study alternative models that include more than one
secondary variable, as well as possible interactions.

Keywords: Partial credit model, item response model,
performance tests, item statistics, achievement estimation


APPENDIX A. SAS SYNTAX

/* Read in */
data CCS;
infile "H:\CCS\ DATA\SASdatalN.dat";
INPUT per index y3 I1 I2 I3 I4 I5 I6 niseq time MC male;
RUN;

/* Estimate */

/* Model 1 - PCM with no covariates, CCS data, three categories: 0-2 */

PROC NLMIXED data=CCS method=gauss technique=quanew noad qpoints=10;
PARMS b1_1-b1_6=0 b2_1-b2_6=0 sd=0.5;
/* I1-I6 act as item indicators, so beta1 and beta2 select the two step parameters of the current item */
beta1=b1_1*I1+b1_2*I2+b1_3*I3+b1_4*I4+b1_5*I5+b1_6*I6;
beta2=b2_1*I1+b2_2*I2+b2_3*I3+b2_4*I4+b2_5*I5+b2_6*I6;
/* numerators of the category 1 and category 2 probabilities */
exp1=exp(theta-beta1);
exp2=exp(2*theta-beta1-beta2);
denom=1+exp1+exp2;
if (y3=0) then p=1/denom;
else if (y3=1) then p=exp1/denom;
else if (y3=2) then p=exp2/denom;
/* guard against taking the log of a near-zero probability */
if (p>1e-8) then ll=log(p);
else ll=-1e100;

MODEL y3~general(ll);
RANDOM theta~normal(0,sd**2) subject=per;

ESTIMATE 'sd**2' sd**2;

RUN;

/* Model 2 - PCM with item sequence covariate, CCS data, three categories: 0-2 */

PROC NLMIXED data=CCS method=gauss technique=quanew noad qpoints=10;
PARMS b1_1-b1_6=0 b2_1-b2_6=0 ts=0 sd=0.5;
theta=eps+ts*niseq;
beta1=b1_1*I1+b1_2*I2+b1_3*I3+b1_4*I4+b1_5*I5+b1_6*I6;
beta2=b2_1*I1+b2_2*I2+b2_3*I3+b2_4*I4+b2_5*I5+b2_6*I6;
exp1=exp(theta-beta1);
exp2=exp(2*theta-beta1-beta2);
denom=1+exp1+exp2;
if (y3=0) then p=1/denom;
else if (y3=1) then p=exp1/denom;
else if (y3=2) then p=exp2/denom;
if (p>1e-8) then ll=log(p);
else ll=-1e100;

MODEL y3~general(ll);
RANDOM eps~normal(0,sd**2) subject=per;

ESTIMATE 'sd**2' sd**2;

RUN;

/* Model 3 - PCM with response time covariate, CCS data, three categories: 0-2 */

PROC NLMIXED data=CCS method=gauss technique=quanew noad qpoints=10;
PARMS b1_1-b1_6=0 b2_1-b2_6=0 ti=0 sd=0.5;
theta=eps+ti*time;
beta1=b1_1*I1+b1_2*I2+b1_3*I3+b1_4*I4+b1_5*I5+b1_6*I6;
beta2=b2_1*I1+b2_2*I2+b2_3*I3+b2_4*I4+b2_5*I5+b2_6*I6;
exp1=exp(theta-beta1);
exp2=exp(2*theta-beta1-beta2);
denom=1+exp1+exp2;
if (y3=0) then p=1/denom;
else if (y3=1) then p=exp1/denom;
else if (y3=2) then p=exp2/denom;
if (p>1e-8) then ll=log(p);
else ll=-1e100;

MODEL y3~general(ll);
RANDOM eps~normal(0,sd**2) subject=per;

ESTIMATE 'sd**2' sd**2;

RUN;

/* Model 4 - PCM with gender covariate (male coded as 1), CCS data, three
categories: 0-2 */

PROC NLMIXED data=CCS method=gauss technique=quanew noad qpoints=10;
PARMS b1_1-b1_6=0 b2_1-b2_6=0 g=0 sd=0.5;
theta=eps+g*male;
beta1=b1_1*I1+b1_2*I2+b1_3*I3+b1_4*I4+b1_5*I5+b1_6*I6;
beta2=b2_1*I1+b2_2*I2+b2_3*I3+b2_4*I4+b2_5*I5+b2_6*I6;
exp1=exp(theta-beta1);
exp2=exp(2*theta-beta1-beta2);
denom=1+exp1+exp2;
if (y3=0) then p=1/denom;
else if (y3=1) then p=exp1/denom;
else if (y3=2) then p=exp2/denom;
if (p>1e-8) then ll=log(p);
else ll=-1e100;

MODEL y3~general(ll);
RANDOM eps~normal(0,sd**2) subject=per;

ESTIMATE 'sd**2' sd**2;

RUN;

/* Model 5 - PCM with MCT scores as a person covariate, CCS data, three
categories: 0-2 */

PROC NLMIXED data=CCS method=gauss technique=quanew noad qpoints=10;
PARMS b1_1-b1_6=0 b2_1-b2_6=0 t=0 sd=0.5;
theta=eps+t*MC;
beta1=b1_1*I1+b1_2*I2+b1_3*I3+b1_4*I4+b1_5*I5+b1_6*I6;
beta2=b2_1*I1+b2_2*I2+b2_3*I3+b2_4*I4+b2_5*I5+b2_6*I6;
exp1=exp(theta-beta1);
exp2=exp(2*theta-beta1-beta2);
denom=1+exp1+exp2;
if (y3=0) then p=1/denom;
else if (y3=1) then p=exp1/denom;
else if (y3=2) then p=exp2/denom;
if (p>1e-8) then ll=log(p);
else ll=-1e100;

MODEL y3~general(ll);
RANDOM eps~normal(0,sd**2) subject=per;

ESTIMATE 'sd**2' sd**2;

RUN;



